SUMMARY

Many cnidarians utilize green-fluorescent proteins (GFPs) as energy-transfer acceptors in
bioluminescence. GFPs fluoresce in vivo upon receiving energy from either a luciferase-oxyluciferin
excited-state complex or a Ca2+-activated photoprotein. These highly fluorescent proteins are unique
due to the chemical nature of their chromophore, which is comprised of modified amino acid (aa)
residues within the polypeptide. This report describes the cloning and sequencing of both cDNA and
genomic clones of GFP from the cnidarian, Aequorea victoria. The gfp10 cDNA encodes a 238-aa-residue
polypeptide with a calculated Mr of 26 888. Comparison of A. victoria GFP genomic clones shows three
different restriction enzyme patterns, which suggests that at least three different genes are present
in the A. victoria population at Friday Harbor, Washington. The gfp gene encoded by the λGFP2 genomic
clone is comprised of at least three exons spread over 2.6 kb. The nucleotide sequences of the cDNA
and the gene will aid in the elucidation of structure-function relationships in this unique class of
proteins.



Correspondence to: Dr. D.C. Prasher, Redfield Bldg., Woods Hole Oceanographic Institution,
Woods Hole, MA 02543 (U.S.A.) Tel. (508)457-2000, ext. 2311; Fax (508)457-2195.

Abbreviations: A., Aequorea; aa, amino acid(s); bp, base pair(s); GFP, green-fluorescent protein;
gfp, DNA or RNA encoding GFP; kb, kilobase(s) or 1000 bp; nt, nucleotide(s); oligo,
oligodeoxyribonucleotide; ORF, open reading frame(s).

INTRODUCTION

Bioluminescence is common in a variety of marine invertebrates. Many cnidarians and probably all
ctenophores emit light when mechanically disturbed. Proteins responsible for bioluminescence from
several species of these two phyla have been characterized. Light from luminescent cnidaria is
primarily green whereas light emitted from ctenophores is blue. The green light of cnidaria is due
to the presence of a class of proteins called green-fluorescent proteins (GFPs). They are highly
fluorescent and are activated in vivo by an energy transfer process via a luciferase or a
Ca2+-activated photoprotein, both of which produce energy during the oxidation of coelenterate-type
luciferin. In the cnidarian Aequorea, the photoprotein aequorin excites the GFP by an unknown
mechanism to release green light. Previous studies suggesting that Aequorea GFP is stimulated via a
radiationless mechanism (Morise et al., 1974) have been questioned (Ward, 1979). The GFP from
Renilla, another cnidarian, on the other hand, clearly receives energy from the Renilla
luciferase-oxyluciferin excited state complex by a radiationless energy transfer mechanism
(Ward and Cormier, 1976).

The GFPs most thoroughly studied have been isolated from Aequorea and Renilla (Ward, 1979). The
Aequorea GFP has been reported to be a 30-kDa monomer (Prendergast and Mann, 1978) whereas the
Renilla GFP is a 54-kDa homodimer (Ward and Cormier, 1979). The two proteins have different
absorption spectra but identical emission spectra (λmax = 509 nm). Upon denaturation the two GFPs
have the same absorption spectra. Ward et al. (1980) have predicted that both Aequorea and Renilla
GFPs contain chromophores having the same structure but that the different absorption spectra are
explained by different apoprotein environments.

Biochemical properties of the Aequorea GFP show it to have unique structural properties. The
fluorescent chromophore is stable to a variety of harsh conditions including heat, extreme pH, and
chemical denaturants. Fluorescence is lost, for example, to base or acid treatment or addition of
guanidine hydrochloride, but upon neutralization of the pH or removal of the denaturant, fluorescence
returns with an identical emission spectrum (Bokman and Ward, 1981; Ward and Bokman, 1982). The
chromophore structure is very different from those of the phycobiliproteins, which are also highly
fluorescent. The chromophore in the GFPs is covalently bound and is formed by modification of certain
aa residues within the polypeptide. The chemical structure of the Aequorea GFP chromophore, first
characterized by Shimomura (1979), has been thoroughly re-examined (Ward et al., 1989; W.W.W.,
unpublished) and is shown here (Fig. 1) in its revised form. In this study, the Aequorea GFP gene and
its cDNA have been isolated and characterized in pursuit of elucidating the mechanism of energy
transfer between aequorin and GFP as well as addressing evolutionary relationships in coelenterate
bioluminescence.

Fig. 1. The chemical structure of the chromophore in Aequorea GFP (W.W.W., unpublished). The cyclized
chromophore is formed from the trimer Ser-dehydroTyr-Gly within the polypeptide by an unknown
mechanism.



EXPERIMENTAL AND DISCUSSION 

(a) Construction of cDNA libraries 

An A. victoria cDNA library, constructed in pBR322 (Prasher et al., 1985), was screened for the
presence of a gfp cDNA using two oligo mixtures whose sequences were based on the aa sequences
derived from GFP-derived CNBr fragments. The oligos contained the following nt sequences, where
R = A or G and Y = C or T: A: 5'-AARAARTCRTGYTGYTTCAT (20-mer with 32 redundancies);
B: 5'-TTRTARTTRTAYTCCAT (17-mer with 16 redundancies). The hybridization of the 32P-labeled
mixtures A and B to replicate filters containing this library was performed according to the method
of Wood et al. (1985) utilizing tetramethylammonium chloride during the washing steps. The
temperatures used during the washing steps for mixtures A and B were 55°C and 50°C, respectively.
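
The redundancy figures quoted for the two oligo mixtures can be checked with a short calculation. The sketch below assumes the degenerate positions are written in IUPAC notation (R = A or G, Y = C or T); the constant names, the function names and this rendering of the two mixtures are illustrative assumptions rather than part of the original methods.

```python
from itertools import product

# IUPAC nucleotide codes mapped to the bases each code stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Assumed IUPAC rendering of oligo mixtures A and B (5' to 3').
MIXTURE_A = "AARAARTCRTGYTGYTTCAT"
MIXTURE_B = "TTRTARTTRTAYTCCAT"


def degeneracy(oligo: str) -> int:
    """Return the number of distinct sequences in a degenerate mixture."""
    count = 1
    for code in oligo.upper():
        count *= len(IUPAC[code])
    return count


def expand(oligo: str) -> list[str]:
    """List every individual sequence represented by the mixture."""
    choices = [IUPAC[code] for code in oligo.upper()]
    return ["".join(bases) for bases in product(*choices)]


if __name__ == "__main__":
    for name, oligo in (("A", MIXTURE_A), ("B", MIXTURE_B)):
        print(name, len(oligo), degeneracy(oligo))
    # prints:
    # A 20 32
    # B 17 16
```

Each two-fold degenerate position doubles the size of the mixture, so five such positions give 32 sequences for mixture A and four give 16 for mixture B.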

A single gfp cDNA was isolated from the library by this method. This clone, pGFP1, contained a PstI
insert of 511 bp having an ORF encoding 168 aa. The deduced translation of the nt sequence indicated
the gfp1 cDNA lacked both the 5'- and 3'-sequences of the coding region. However, the sequence FSYGVQ
within the deduced translation permitted the chromophore structure to be deciphered (W.W.W.,
unpublished). Upon rescreening the library with gfp1 cDNA, no additional cDNAs were found.

A second A. victoria cDNA library was constructed (Gubler and Hoffman, 1983) in λgt10 (Huynh et al.,
1985). The PstI insert from gfp1 cDNA was used as a hybridization probe against the entire λgt10
library of 1.4 x 10^6 recombinant phage. No gfp-specific recombinants were found upon screening the
primary library. The phage remaining on the plates were extracted from the top agar and used as an
amplified library (Maniatis et al., 1982). Upon screening this preparation of the library, four
recombinants hybridized to the gfp1 cDNA following their purification. The four cDNA clones were
designated λGFP10, 11, 12, and 13. All four recombinants were shown to contain an insert of 1 kb upon
digestion with EcoRI.

(b) Characterization of the gfp10 cDNA

The entire EcoRI insert of λGFP10 was sequenced (Fig. 2). Limited nt sequences obtained from λGFP11
and 12 were identical with that from λGFP10, suggesting that they were siblings and, hence, they were
not sequenced further. Even though the entire coding region appears to be present (see below), three
features of the cDNA insert of λGFP10 suggest it is not quite full-length. First, the cDNA is shorter
than the gfp mRNA, which is 1.05 kb in length as determined by Northern analysis (Fig. 3). Second,
the 5'-untranslated region is very short. Third, no poly(A) tract is observed in the gfp10 cDNA
sequence (Fig. 2) despite the presence of the gfp mRNA in only the poly(A)+ RNA fraction of
A. victoria RNA (data not shown). A typical polyadenylation signal is located at nt 861-865 (Fig. 2).

The nt sequence of the gfp10 cDNA contains an ORF encoding a 238-aa protein having a calculated Mr of
26 888. This compares favorably with 30 kDa for native GFP as determined by denaturing
electrophoresis (Prendergast and Mann, 1978). The deduced translation contains aa sequences of
numerous peptides isolated from native GFP (underlined in Fig. 2). When compared to the gfp10 cDNA
sequence (Fig. 2), the gfp1 cDNA was determined to encode aa residues 28-195. Oligo mixture A is
complementary to the codons encoding aa 78-84 and mixture B is complementary to the codons encoding
aa 141-146 (Fig. 2). The trimer Ser-Tyr-Gly, modified in the native protein to form the chromophore
(W.W.W., unpublished), is located at aa 65-67. The chromophore consists of an imidazolone ring formed
by the residues Ser-dehydroTyr-Gly within the polypeptide (Fig. 1). Located 8 aa upstream of this
chromopeptide is GFP's only Trp. The inability to detect the fluorescence from this Trp makes it
unusual (W.W.W., unpublished). Perhaps energy transfer occurs between it and the chromophore in the
native protein, preventing the Trp fluorescence (320-350 nm). The Trp is flanked by several Pro
residues (Pro-Val-Pro-Trp-Pro). The significance of this pentapeptide is not understood but a search
of the protein databases (PIR ver 25; Swiss-Prot ver 14) shows it to be present only in cytochrome
P-450 proteins.
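
The residue coordinates given in this section can be cross-checked against the clone and oligo sizes reported in section (a). The following sketch is only a consistency check on numbers stated in the text. The helper names are hypothetical, and counting "8 aa upstream" from the first chromophore residue (Ser65) is an assumed convention.

```python
def residues_in_span(first_aa: int, last_aa: int) -> int:
    """Number of residues in an inclusive aa range."""
    return last_aa - first_aa + 1


def coding_nt(first_aa: int, last_aa: int) -> int:
    """Number of nt needed to encode an inclusive aa range."""
    return 3 * residues_in_span(first_aa, last_aa)


# gfp1 cDNA: residues 28-195, reported as a 168-aa ORF in a 511-bp PstI insert.
gfp1_aa = residues_in_span(28, 195)
gfp1_nt = coding_nt(28, 195)
print(gfp1_aa, gfp1_nt)          # 168 504

# Oligo mixtures: A spans aa 78-84 (20-mer), B spans aa 141-146 (17-mer).
print(coding_nt(78, 84))         # 21
print(coding_nt(141, 146))       # 18

# The single Trp, taken as 8 residues upstream of Ser65 (assumed convention).
CHROMOPHORE_FIRST_AA = 65
trp_position = CHROMOPHORE_FIRST_AA - 8
print(trp_position)              # 57
```

The 168 residues account for 504 of the 511 bp in the PstI insert. Each oligo is one nt shorter than the codons it spans, as would be expected if the degenerate third position of a terminal codon were left out of the probe; that interpretation is an inference, not a statement from the text.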

(c) Isolation and characterization of gfp genomic clones 

The gfp1 cDNA was also used to isolate genomic clones prior to the availability of the gfp10 cDNA. An
A. victoria genomic library was constructed in λ2001 (Karn et al., 1984) essentially as described
(Maniatis et al., 1982). Eight recombinant phages hybridizing to the gfp1 cDNA were purified from the
genomic DNA library. Based on restriction enzyme and Southern-blot analyses, they represent six
different isolates having at least three different restriction maps (Fig. 4). When DNA fragments from
the 5'- and 3'-ends of the gfp1 cDNA were used as hybridization probes, all of the genomic clones
were found likely to contain the 5'-end of the gene, but only gfp2, 3, and 9 also contained the
3'-end. The three types of genomic clones are consistent with the presence of multiple GFP isoforms
isolated from A. victoria (A. Roth, M. Cutler and W.W.W., unpublished). Since the A. victoria genomic
DNA used for the genomic library was isolated from a large number of jellyfish (collected at Friday
Harbor, Washington), the three gfp genes are representative of the Aequorea population as opposed to
individual jellyfish.

Fig. 4. (A) Restriction maps of three representative genomic clones are compared. Double lines
represent those DNA fragments which hybridize to the gfp1 cDNA. (B) The exon/intron arrangement of
the gene encoded by λGFP2 was determined by comparing the nt sequences of the EcoRI-BamHI and
overlapping HindIII fragments of λGFP2 and the EcoRI insert of the λGFP10 cDNA. The exons are
represented by the blackened boxes, I, II and III. The GenBank accession No. for the gfp2 sequence
is M6265

The EcoRI-BamHI and an overlapping HindIII fragments in the genomic clone λGFP2 (Fig. 4) were
sequenced and compared to that of the gfp10 cDNA to examine the structure of the gene. The gfp gene
encoded by λGFP2 contains at least three exons spread over 2.6 kb (Fig. 4). These exons, designated
II, III, and IV, encode 69, 98, and 71 aa, respectively. Presumably, a fourth exon is located
upstream in the genome, since the 15 nt at the 5' end of the gfp10 cDNA sequence cannot be aligned to
the 5' region of the DNA sequence derived from λGFP2. The positions of the introns with respect to
the cDNA sequence are indicated (Fig. 2). The aa residues involved in the chromophore are encoded at
the 3' end of exon II. The nt sequences of the gfp mRNA splice junctions agree reasonably well with
consensus sequences.

The gfp10 cDNA is not encoded by the gfp2 gene since there are several nt differences between their
sequences. The nt differences within the protein-coding regions are summarized in Table IA. Four of
the 12 single nt differences result in conservative aa replacements at positions 100, 108, 141 and
219 (Table IB). The aa residues encoded at these four positions are consistent with the aa sequences
observed in GFP-derived peptides, which showed a Tyr at position 100, a Met at position 141, but a
Thr at position 108. Eight additional nt differences occur with the gfp2 gene in the
3'-non-translated region of the gfp10 cDNA (data not shown). It is not known whether the gfp10 cDNA
represents an allele of gfp2 or another gfp gene.
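
The kind of comparison summarized above, in which single nt differences between a cDNA and a gene are sorted into silent changes and aa replacements, can be sketched as follows. The two short sequences are hypothetical toy inputs, not the gfp10 or gfp2 sequences, and the function name is an assumption made for illustration; the codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def compare_coding(cdna: str, gene: str):
    """Compare two aligned, in-frame coding sequences of equal length.

    Returns the 1-based nt positions that differ and a list of
    (codon number, aa in cdna, aa in gene) for each aa replacement.
    """
    if len(cdna) != len(gene) or len(cdna) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame and equal in length")
    nt_diffs = [i + 1 for i, (a, b) in enumerate(zip(cdna, gene)) if a != b]
    replacements = []
    for start in range(0, len(cdna), 3):
        aa_cdna = CODON_TABLE[cdna[start:start + 3]]
        aa_gene = CODON_TABLE[gene[start:start + 3]]
        if aa_cdna != aa_gene:
            replacements.append((start // 3 + 1, aa_cdna, aa_gene))
    return nt_diffs, replacements


if __name__ == "__main__":
    # Hypothetical five-codon sequences chosen only to illustrate the method.
    toy_cdna = "TTTACTCTGGTTGGA"   # Phe Thr Leu Val Gly
    toy_gene = "TATACTATGGTTGGC"   # Tyr Thr Met Val Gly
    diffs, changes = compare_coding(toy_cdna, toy_gene)
    print(diffs)     # [2, 7, 15]
    print(changes)   # [(1, 'F', 'Y'), (3, 'L', 'M')]
```

In the toy input, three nt differ but only two alter the encoded residue; the third falls in a degenerate codon position and is silent. The same distinction underlies the statement that only four of the 12 single nt differences change the protein.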

These results will enable us to construct an expression vector for the preparation of non-fluorescent
apoGFP. Since no information is yet available regarding the biosynthesis of the chromophore, a
recombinant form of this protein will be a valuable reagent with which to examine the biochemistry of
chromophore formation in this unique class of proteins and the mechanism of energy transfer between
aequorin and GFP.